MQE: fix incorrect query results or "found duplicate series for the match group" errors when binary operation has unsorted labels in on #9482

Merged: charleskorn merged 3 commits into main from charleskorn/mqe-binop-with-unsorted-on on Oct 1, 2024

@charleskorn charleskorn commented Oct 1, 2024

What this PR does

This PR fixes a bug in MQE's binary operation implementation where queries can return incorrect results or incorrectly fail with "found duplicate series for the match group" errors.

labels.Labels.BytesWithLabels and labels.Labels.BytesWithoutLabels expect the list of label names they receive to be sorted. However, we weren't ensuring this was true when BytesWithLabels was called to generate the grouping key for a binary operation with on, so incorrect grouping keys were generated.
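
To illustrate the failure mode, here is a minimal stdlib-only sketch. It assumes the key is built with a two-pointer merge over the series' sorted labels and the requested names; `label` and `groupingKey` are hypothetical stand-ins written for this example, not the Prometheus implementation, which may differ in detail.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// label is a hypothetical stand-in for a single name/value pair.
type label struct{ name, value string }

// groupingKey walks the series' labels (sorted by name) and the requested
// names in lockstep, the way a merge over two sorted lists does.
// It is only correct if names is sorted as well.
func groupingKey(sortedLabels []label, names []string) string {
	var b strings.Builder
	i, j := 0, 0
	for i < len(sortedLabels) && j < len(names) {
		switch {
		case names[j] < sortedLabels[i].name:
			j++ // requested name not present on this series
		case sortedLabels[i].name < names[j]:
			i++ // label not requested (or already skipped past, if names is unsorted)
		default:
			b.WriteString(sortedLabels[i].name)
			b.WriteByte('=')
			b.WriteString(sortedLabels[i].value)
			b.WriteByte(';')
			i++
			j++
		}
	}
	return b.String()
}

func main() {
	a := []label{{"env", "prod"}, {"pod", "a"}}
	b := []label{{"env", "test"}, {"pod", "a"}}

	// on(pod, env): the scan skips past "env" before it is ever asked for,
	// so both series collapse to the same key.
	unsorted := []string{"pod", "env"}
	fmt.Println(groupingKey(a, unsorted) == groupingKey(b, unsorted)) // true

	// Sorting a copy of the names first restores distinct keys.
	sorted := append([]string(nil), unsorted...)
	sort.Strings(sorted)
	fmt.Println(groupingKey(a, sorted) == groupingKey(b, sorted)) // false
}
```

In this sketch, the unsorted `on(pod, env)` case drops `env` from the key entirely, which is the kind of collision that can surface as a "found duplicate series for the match group" error or as series being matched incorrectly.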

Which issue(s) this PR fixes or relates to

(none)

Checklist

  • Tests updated.
  • [n/a] Documentation added.
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX].
  • [n/a] about-versioning.md updated with experimental features.

…atch group" errors when binary operation has unsorted labels in `on`
@charleskorn charleskorn marked this pull request as ready for review October 1, 2024 04:18
@charleskorn charleskorn requested a review from a team as a code owner October 1, 2024 04:18
@jhesketh jhesketh left a comment

What do you think about having BytesWithLabels (etc) do the sort?

Granted, there are cases where we don't need to, and thus we don't want to add the overhead, so maybe we could have a sort parameter on the function? This means we'd have to be explicit when using BytesWithLabels, which would hopefully avoid similar problems in the future.

@@ -98,11 +98,21 @@ eval range from 0 to 24m step 6m left_side - on(env, pod) right_side
{env="test", pod="a"} -9 -18 -27
{env="test", pod="b"} -36 -45 -54

eval range from 0 to 24m step 6m left_side - on(pod, env) right_side
Contributor

(nit) should add a comment here on the purpose of these tests and that the load order needs to be preserved

Contributor Author

that the load order needs to be preserved

Not sure I follow this part - what are you referring to by "load order" here?

Contributor

The order the series are loaded in (above, on line 88) could affect this test, right? It's possible they are loaded in such a way that the series are already sorted after the on/etc.

Contributor Author

It doesn't matter what order the series are in - the only thing that matters is the order of the label names passed to BytesWithLabels or BytesWithoutLabels.

(and regardless of the order of the series in the load, they'll be retrieved in sorted order due to how chunks streaming works)

@charleskorn
Contributor Author

What do you think about having BytesWithLabels (etc) do the sort?

Granted, there are cases where we don't need to, and thus we don't want to add the overhead, so maybe we could have a sort parameter on the function? This means we'd have to be explicit when using BytesWithLabels, which would hopefully avoid similar problems in the future.

BytesWithLabels and BytesWithoutLabels are methods from Prometheus, so we'd have to convince them to add this parameter or change the behaviour. I doubt this would be accepted:

  • adding the parameter is a breaking change to both methods
  • most places that use either method call the method over and over again with the same set of labels like we do, so having BytesWithLabels or BytesWithoutLabels do the sorting would be very inefficient in comparison to having the caller do it once
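
As a rough sketch of the "caller sorts once" pattern described above: `newKeyFunc` is a hypothetical helper written for this example (it is not part of Mimir or Prometheus), and the map-based key builder is a simplified stand-in for the real methods. The point is only that the sort happens once per operator, not once per series.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// newKeyFunc sorts a copy of names once, up front, and returns a function
// that reuses the sorted copy for every series it is given.
// The caller's slice is left untouched.
func newKeyFunc(names []string) func(series map[string]string) string {
	sorted := append([]string(nil), names...)
	sort.Strings(sorted)

	return func(series map[string]string) string {
		var b strings.Builder
		for _, n := range sorted {
			if v, ok := series[n]; ok {
				b.WriteString(n)
				b.WriteByte('=')
				b.WriteString(v)
				b.WriteByte(';')
			}
		}
		return b.String()
	}
}

func main() {
	// Label names as written in the query: on(pod, env).
	keyFor := newKeyFunc([]string{"pod", "env"})

	series := []map[string]string{
		{"env": "test", "pod": "a", "job": "x"},
		{"env": "test", "pod": "b", "job": "x"},
	}
	for _, s := range series {
		fmt.Println(keyFor(s)) // env=test;pod=a; then env=test;pod=b;
	}
}
```

With this shape, on(pod, env) and on(env, pod) produce identical keys, and the cost of sorting does not grow with the number of series.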

@jhesketh jhesketh left a comment

We can create a wrapper method to do the sorting if needed. But granted, the number of times it would be called is an issue (and the complexity of only passing sort=true once probably outweighs the benefit).

@charleskorn charleskorn enabled auto-merge (squash) October 1, 2024 04:45
@charleskorn charleskorn merged commit e8e1e13 into main Oct 1, 2024
29 checks passed
@charleskorn charleskorn deleted the charleskorn/mqe-binop-with-unsorted-on branch October 1, 2024 05:03
@grafanabot
Contributor

The backport to r308 failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new branch
git switch --create backport-9482-to-r308 origin/r308
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x e8e1e13dd9d186954b422406a79eeabbf2cfccaa
# Push it to GitHub
git push --set-upstream origin backport-9482-to-r308
git switch main
# Remove the local backport branch
git branch -D backport-9482-to-r308

Then, create a pull request where the base branch is r308 and the compare/head branch is backport-9482-to-r308.

@grafanabot
Contributor

The backport to r309 failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new branch
git switch --create backport-9482-to-r309 origin/r309
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x e8e1e13dd9d186954b422406a79eeabbf2cfccaa
# Push it to GitHub
git push --set-upstream origin backport-9482-to-r309
git switch main
# Remove the local backport branch
git branch -D backport-9482-to-r309

Then, create a pull request where the base branch is r309 and the compare/head branch is backport-9482-to-r309.

grafanabot pushed a commit that referenced this pull request Oct 1, 2024
…atch group" errors when binary operation has unsorted labels in `on` (#9482)

* MQE: fix incorrect query results or "found duplicate series for the match group" errors when binary operation has unsorted labels in `on`

* Add changelog entry

* Address PR feedback: explain purpose of tests

(cherry picked from commit e8e1e13)
charleskorn added a commit that referenced this pull request Oct 1, 2024
…atch group" errors when binary operation has unsorted labels in `on` (#9482)

* MQE: fix incorrect query results or "found duplicate series for the match group" errors when binary operation has unsorted labels in `on`

* Add changelog entry

* Address PR feedback: explain purpose of tests

(cherry picked from commit e8e1e13)

# Conflicts:
#	CHANGELOG.md
charleskorn added a commit that referenced this pull request Oct 1, 2024
…atch group" errors when binary operation has unsorted labels in `on` (#9482) (#9483)

* MQE: fix incorrect query results or "found duplicate series for the match group" errors when binary operation has unsorted labels in `on`

* Add changelog entry

* Address PR feedback: explain purpose of tests

(cherry picked from commit e8e1e13)

Co-authored-by: Charles Korn <[email protected]>
charleskorn added a commit that referenced this pull request Oct 1, 2024
…atch group" errors when binary operation has unsorted labels in `on` (#9482) (#9485)

* MQE: fix incorrect query results or "found duplicate series for the match group" errors when binary operation has unsorted labels in `on`

* Add changelog entry

* Address PR feedback: explain purpose of tests

(cherry picked from commit e8e1e13)

# Conflicts:
#	CHANGELOG.md
charleskorn added a commit that referenced this pull request Oct 1, 2024
…atch group" errors when binary operation has unsorted labels in `on` (#9482) (#9484)

* MQE: fix incorrect query results or "found duplicate series for the match group" errors when binary operation has unsorted labels in `on`

* Add changelog entry

* Address PR feedback: explain purpose of tests

(cherry picked from commit e8e1e13)

# Conflicts:
#	CHANGELOG.md
@jhesketh jhesketh mentioned this pull request Dec 2, 2024